Data Compression




NEURAL: Attention-Guided Pruning for Unified Multimodal Resource-Constrained Clinical Evaluation

Joshi, Devvrat, Rekik, Islem

arXiv.org Artificial Intelligence

The rapid growth of multimodal medical imaging data presents significant storage and transmission challenges, particularly in resource-constrained clinical settings. We propose NEURAL, a novel framework that addresses this by using semantics-guided data compression. Our approach repurposes cross-attention scores between the image and its radiological report from a fine-tuned generative vision-language model to structurally prune chest X-rays, preserving only diagnostically critical regions. This process transforms the image into a highly compressed graph representation. This unified graph-based representation fuses the pruned visual graph with a knowledge graph derived from the clinical report, creating a universal data structure that simplifies downstream modeling. Validated on the MIMIC-CXR and CheXpert Plus datasets for pneumonia detection, NEURAL achieves a 93.4-97.7% reduction in image data size while maintaining a high diagnostic performance of 0.88-0.95 AUC, outperforming baseline models that use uncompressed data. By creating a persistent, task-agnostic data asset, NEURAL resolves the trade-off between data size and clinical utility, enabling efficient workflows and teleradiology without sacrificing performance. Our NEURAL code is available at https://github.com/basiralab/NEURAL.
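The pruning step can be pictured as keeping only the image patches that receive the most cross-attention mass from the report. A minimal numpy sketch; the flattened-patch representation, the `keep_ratio` default, and the plain top-k selection are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def prune_patches(image_patches, attn_scores, keep_ratio=0.05):
    """Keep only the patches with the highest cross-attention scores.

    image_patches: (N, D) array of flattened patches
    attn_scores:   (N,) attention mass each patch receives from the report
    keep_ratio:    fraction of patches to retain (illustrative default)
    """
    n_keep = max(1, int(len(attn_scores) * keep_ratio))
    keep_idx = np.sort(np.argsort(attn_scores)[-n_keep:])  # top-k, in order
    return keep_idx, image_patches[keep_idx]
```

The retained patches (plus their indices) would then become the nodes of the pruned visual graph that gets fused with the report-derived knowledge graph.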


LZ Penalty: An information-theoretic repetition penalty for autoregressive language models

Ginart, Antonio A., Kodali, Naveen, Lee, Jason, Xiong, Caiming, Savarese, Silvio, Emmons, John R.

arXiv.org Artificial Intelligence

We introduce the LZ penalty, a penalty specialized for reducing degenerate repetition in autoregressive language models without loss of capability. The penalty is based on the codelengths in the LZ77 universal lossless compression algorithm. Through the lens of the prediction-compression duality, decoding with the LZ penalty can be interpreted as sampling from the residual distribution after removing the information that is highly compressible. We demonstrate that the LZ penalty enables state-of-the-art open-source reasoning models to operate with greedy (temperature-zero) decoding without loss of capability and without instances of degenerate repetition. Both the industry-standard frequency penalty and repetition penalty are ineffective by comparison, incurring degenerate repetition rates of up to 4%.
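A toy version of the idea: down-weight each candidate token in proportion to the length of the LZ77-style match its emission would extend, since long matches are exactly the highly compressible repetitions. The brute-force match search and the `alpha` strength below are illustrative stand-ins, not the paper's actual codelength computation:

```python
import numpy as np

def lz_match_length(context, token):
    """Length of the longest suffix of `context + [token]` that also
    occurs earlier in `context` (a crude LZ77-style match length)."""
    seq = list(context) + [token]
    hay = seq[:-1]
    for m in range(len(seq) - 1, 0, -1):          # try longest match first
        suffix = seq[-m:]
        for i in range(len(hay) - m + 1):
            if hay[i:i + m] == suffix:
                return m
    return 0

def lz_penalized_logits(logits, context, alpha=0.5):
    """Subtract a penalty proportional to the match length each candidate
    token would create; tokens that extend a long repeat are suppressed."""
    out = np.array(logits, dtype=float)
    for tok in range(len(out)):
        out[tok] -= alpha * lz_match_length(context, tok)
    return out
```

With context `[1, 2, 3, 1, 2]`, the token `3` would complete a second copy of `[1, 2, 3]`, so it receives the largest penalty, while an unseen token is left untouched.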


EdgeCodec: Onboard Lightweight High Fidelity Neural Compressor with Residual Vector Quantization

Hodo, Benjamin, Polonelli, Tommaso, Moallemi, Amirhossein, Benini, Luca, Magno, Michele

arXiv.org Artificial Intelligence

This paper has been accepted for publication at the International Workshop on Advances in Sensors and Interfaces (IWASI), Italy, 2025. DOI: to be added when available. Data compression is a staple of data processing and storage. Sending and storing data more efficiently is an open challenge in the Internet-of-Things (IoT), with devices typically characterized by limited availability of energy and computing power. The problem tackled in this paper is the massive amount of sensor data collected and sent uncompressed by IoT devices. We address this issue by compressing local data using a neural network supplemented with the Residual Vector Quantization (RVQ) technique. This paper, inspired by lossy neural compressors for audio like Google SoundStream and Meta EnCodec, proposes EdgeCodec: a lightweight lossy neural compressor specifically designed to run at the edge on low-power and resource-constrained Microcontroller Units (MCUs). EdgeCodec processes multi-channel data with a flexible end-to-end learnable pipeline. We evaluate EdgeCodec in a real-life challenging use case, namely wind turbine monitoring using a 40-channel barometric sensor. Under the proposed use case, EdgeCodec reaches a Compression Ratio (CR) between 2560 and 10240 that can be varied in real-time to tune the tradeoff between compression and reconstruction quality.
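The RVQ building block mentioned above can be sketched independently of the neural network: each stage quantizes the residual left by the previous stage against its own codebook, so reconstruction quality improves with every added stage. The codebook sizes and contents here are toy assumptions:

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Residual vector quantization of a single vector x (shape (D,)).

    Each codebook is a (K, D) array of codewords; the stage picks the
    nearest codeword to the current residual, then passes what is left
    over to the next stage.
    """
    residual = np.asarray(x, dtype=float)
    codes = []
    for cb in codebooks:
        idx = int(np.argmin(np.linalg.norm(cb - residual, axis=1)))
        codes.append(idx)
        residual = residual - cb[idx]          # next stage sees the residual
    return codes

def rvq_decode(codes, codebooks):
    """Reconstruction is simply the sum of the selected codewords."""
    return sum(cb[i] for i, cb in zip(codes, codebooks))
```

The transmitted payload is just the list of per-stage indices, which is what makes the very high compression ratios reported above possible.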


SOSAE: Self-Organizing Sparse AutoEncoder

Modi, Sarthak Ketanbhai, Lim, Zi Pong, Cao, Yushi, Cheng, Yupeng, Teo, Yon Shin, Lin, Shang-Wei

arXiv.org Artificial Intelligence

Tuning the size of an autoencoder's hidden layers has the benefit of providing optimally compressed representations for the input data. However, such hyper-parameter tuning is computationally expensive and time-consuming, with grid search as the default option. In this paper, we introduce the Self-Organization Regularization for Autoencoders, which dynamically adapts the dimensionality of the feature space to the optimal size. Inspired by physics concepts, the Self-Organizing Sparse AutoEncoder (SOSAE) induces sparsity in the feature space in a structured way that permits the truncation of the non-active part of the feature vector without any loss of information. This is done by penalizing the autoencoder based on both the magnitude and the positional index of the feature vector dimensions, which constricts the feature space in both respects during training. Extensive experiments on various datasets show that SOSAE can tune the feature space dimensionality using up to 130 times fewer floating-point operations (FLOPs) than other baselines while maintaining the same tuning quality and performance.
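The described penalty, weighting each activation by both its magnitude and its positional index, can be sketched as follows. The exact loss form and the `lam` strength are assumptions based on the abstract, not the paper's formula:

```python
import numpy as np

def sosae_penalty(h, lam=1e-3):
    """Positional sparsity penalty on hidden activations h (last axis =
    feature dimensions): each |activation| is weighted by its 1-based
    index, pushing energy toward early dimensions so trailing ones die off."""
    idx = np.arange(1, h.shape[-1] + 1)        # 1-based positional weights
    return lam * np.sum(np.abs(h) * idx, axis=-1)

def active_dims(h, tol=1e-8):
    """Dimensions after the last non-negligible one can be truncated
    without loss, since the penalty drives them to zero."""
    nz = np.nonzero(np.abs(h) > tol)[0]
    return int(nz[-1]) + 1 if len(nz) else 0
```

Because later indices cost more, the optimizer prefers to concentrate information in the leading dimensions, and the trailing all-zero tail can be dropped at inference time.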


Highly Efficient Direct Analytics on Semantic-aware Time Series Data Compression

Sun, Guoyou, Karras, Panagiotis, Zhang, Qi

arXiv.org Artificial Intelligence

Semantic communication has emerged as a promising paradigm to tackle the challenges of massively growing data traffic and sustainable data communication. It shifts the focus from data fidelity to goal-oriented or task-oriented semantic transmission. While deep learning-based methods are commonly used for semantic encoding and decoding, they struggle with the sequential nature of time series data and incur high computation cost, particularly in resource-constrained IoT environments. Data compression plays a crucial role in reducing transmission and storage costs, yet traditional data compression methods fall short of the demands of goal-oriented communication systems. In this paper, we propose a novel method for direct analytics on time series data compressed by the SHRINK compression algorithm. Through experimentation using outlier detection as a case study, we show that our method outperforms baselines running on uncompressed data in multiple cases, with merely a 1% difference in the worst case. Additionally, it achieves four times lower runtime on average and accesses approximately 10% of the data volume, which enables edge analytics with limited storage and computation power. These results demonstrate that our approach offers reliable, high-speed outlier detection analytics for diverse IoT applications while extracting semantics from time-series data, achieving high compression, and reducing data transmission.
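The general pattern, running the analytic on the compressed representation instead of the reconstructed series, can be illustrated with a toy windowed-mean compressor standing in for SHRINK. The window size and z-score rule are illustrative, not the paper's method:

```python
import numpy as np

def compress_means(series, win=10):
    """Toy semantic compression: keep one mean per window (a deliberately
    crude stand-in for SHRINK's richer representation)."""
    n = len(series) // win * win
    return np.asarray(series[:n], dtype=float).reshape(-1, win).mean(axis=1)

def outlier_windows(means, z=2.5):
    """Run the analytic directly on the compressed values: flag windows
    whose mean deviates strongly from the rest, with no decompression."""
    mu, sd = means.mean(), means.std()
    return np.where(np.abs(means - mu) > z * sd)[0]
```

The detector only ever touches the compressed array, which is why this style of analytics can access a small fraction of the original data volume.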


Noumenal Labs White Paper: How To Build A Brain

Ramstead, Maxwell J. D., Pattisapu, Candice, Fox, Jason, Beck, Jeff

arXiv.org Artificial Intelligence

This white paper describes some of the design principles for artificial or machine intelligence that guide efforts at Noumenal Labs. These principles are drawn both from nature and from the means by which we come to represent and understand it. The end goal of research and development in this field should be to design machine intelligences that augment our understanding of the world and enhance our ability to act in it, without replacing us. In the first two sections, we examine the core motivation for our approach: resolving the grounding problem. We argue that the solution to the grounding problem rests in the design of models grounded in the world that we inhabit, not mere word models. A machine superintelligence that is capable of significantly enhancing our understanding of the human world must represent the world as we do and be capable of generating new knowledge, building on what we already know. In other words, it must be properly grounded and explicitly designed for rational, empirical inquiry, modeled after the scientific method. A primary implication of this design principle is that agents must be capable of engaging autonomously in causal physics discovery. We discuss the pragmatic implications of this approach, and in particular, the use cases in realistic 3D world modeling and multimodal, multidimensional time series analysis.


Foundation Model for Lossy Compression of Spatiotemporal Scientific Data

Li, Xiao, Lee, Jaemoon, Rangarajan, Anand, Ranka, Sanjay

arXiv.org Artificial Intelligence

We present a foundation model (FM) for lossy scientific data compression, combining a variational autoencoder (VAE) with a hyper-prior structure and a super-resolution (SR) module. The VAE framework uses hyper-priors to model latent space dependencies, enhancing compression efficiency. The SR module refines low-resolution representations into high-resolution outputs, improving reconstruction quality. By alternating between 2D and 3D convolutions, the model efficiently captures spatiotemporal correlations in scientific data while maintaining low computational cost. Experimental results demonstrate that the FM generalizes well to unseen domains and varying data shapes, achieving up to 4 times higher compression ratios than state-of-the-art methods after domain-specific fine-tuning. The SR module improves compression ratio by 30 percent compared to simple upsampling techniques. This approach significantly reduces storage and transmission costs for large-scale scientific simulations while preserving data integrity and fidelity.


Quantum Implicit Neural Compression

Fujihashi, Takuya, Koike-Akino, Toshiaki

arXiv.org Artificial Intelligence

Signal compression based on implicit neural representation (INR) is an emerging technique to represent multimedia signals with a small number of bits. While INR-based signal compression achieves high-quality reconstruction for relatively low-resolution signals, the accuracy of high-frequency details degrades significantly with a small model. To improve the compression efficiency of INR, we introduce quantum INR (quINR), which leverages the exponentially rich expressivity of quantum neural networks for data compression. Evaluations on benchmark datasets show that the proposed quINR-based compression improves rate-distortion performance in image compression compared with traditional codecs and classic INR-based coding methods, by up to a 1.2 dB gain.
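Classical INR-based compression, the baseline that quINR builds on, treats the learned parameters themselves as the code: fit a small function to the signal, transmit only its weights, and re-evaluate the function to decode. A deliberately simplified linear stand-in using Fourier features in place of a neural (or quantum) network:

```python
import numpy as np

def _features(n_samples, n_freqs):
    """Fourier feature matrix for n_samples points on [0, 1]."""
    t = np.linspace(0, 1, n_samples)
    cols = [np.ones_like(t)]
    for k in range(1, n_freqs + 1):
        cols.append(np.sin(2 * np.pi * k * t))
        cols.append(np.cos(2 * np.pi * k * t))
    return np.column_stack(cols)

def fit_inr(signal, n_freqs=8):
    """'Encode' a 1-D signal as the weights of an implicit model.

    The 2*n_freqs + 1 weights ARE the compressed representation."""
    feats = _features(len(signal), n_freqs)
    w, *_ = np.linalg.lstsq(feats, signal, rcond=None)
    return w

def decode_inr(w, n_samples, n_freqs=8):
    """Evaluate the implicit model at the sample coordinates."""
    return _features(n_samples, n_freqs) @ w
```

A 100-sample sinusoid compresses to 17 coefficients here; real INR codecs replace the linear model with a small MLP (and, in quINR, quantum layers), trading fitting cost for far greater expressivity on high-frequency content.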